Estimating Likelihoods for Topic Models

Author

  • Wray L. Buntine
Abstract

Topic models are a discrete analogue to principal component analysis and independent component analysis that model topics at the word level within a document. They have many variants, such as NMF, PLSI and LDA, and are used in many fields, such as genetics, text and the web, image analysis, and recommender systems. However, only recently have reasonable methods become available for estimating the likelihood of unseen documents, for instance to perform testing or model comparison. This paper explores a number of recent methods and improves their theory, performance, and testing.

∗ Prepared for the 1st Asian Conference on Machine Learning, 2009, Nanjing, China. The original publication is available at www.springerlink.com.
† Also a fellow at Helsinki Institute of IT.
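To make the task in the abstract concrete, the following is a minimal sketch of one simple baseline for estimating the likelihood of an unseen document under an LDA-style model: importance sampling with the Dirichlet prior as the proposal. It is an illustration under stated assumptions, not necessarily one of the methods the paper studies or improves. The names `estimate_log_likelihood`, `phi` (topic-by-word probabilities), `alpha` (Dirichlet parameters), and `doc` (a list of word indices) are hypothetical and chosen only for this example.

```python
import math
import random


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) via normalised gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def estimate_log_likelihood(doc, phi, alpha, num_samples=1000, seed=0):
    """Monte Carlo estimate of log p(doc | phi, alpha).

    Assumed model (LDA-style): theta ~ Dirichlet(alpha), and each word w
    in the document is drawn with probability sum_k theta[k] * phi[k][w].
    Proposal: the prior over theta, so every importance weight is simply
    p(doc | theta, phi). Assumes every word in doc has non-zero
    probability under at least one topic.
    """
    rng = random.Random(seed)
    log_weights = []
    for _ in range(num_samples):
        theta = sample_dirichlet(alpha, rng)
        log_p = 0.0
        for w in doc:
            # Marginalise the topic assignment of this word.
            log_p += math.log(sum(t * topic[w] for t, topic in zip(theta, phi)))
        log_weights.append(log_p)
    # Average the weights in log space.
    return log_sum_exp(log_weights) - math.log(num_samples)
```

Sampling from the prior is easy to implement, but its variance typically grows with document length, which is one reason more careful estimators are of interest.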

Similar resources

Multilevel and Latent Variable Modeling with Composite Links and Exploded Likelihoods

Composite links and exploded likelihoods are powerful yet simple tools for specifying a wide range of latent variable models. Applications considered include survival or duration models, models for rankings, small area estimation with census information, models for ordinal responses, item response models with guessing, randomized response models, unfolding models, latent class models with rando...

Bayesian experimental design for models with intractable likelihoods.

In this paper we present a methodology for designing experiments for efficiently estimating the parameters of models with computationally intractable likelihoods. The approach combines a commonly used methodology for robust experimental design, based on Markov chain Monte Carlo sampling, with approximate Bayesian computation (ABC) to ensure that no likelihood evaluations are required. The utili...

Estimating Latent-Variable Graphical Models using Moments and Likelihoods

Recent work on the method of moments enables consistent parameter estimation, but only for certain types of latent-variable models. On the other hand, pure likelihood objectives, though more universally applicable, are difficult to optimize. In this work, we show that using the method of moments in conjunction with composite likelihood yields consistent parameter estimates for a much broader cla...

Ranking Intrusion Likelihoods with Exploitability of Network Vulnerabilities in a Large-Scale Attack Model

Network vulnerabilities are common sources of many security threats. Attack models representing chains of all possible vulnerability exploits by attackers can help locate security flaws and pre-determine appropriate preventative measures. To realize the full benefits of attack models, effective analysis is crucial. However, due to the size and complexity of the models, manually pinpointing pote...

Estimation of Return to Scale under Weight Restrictions in Data Envelopment Analysis

Returns to scale (RTS) is one of the most important topics in DEA. Not many methods are yet available for estimating RTS in DEA. This paper develops the Banker-Thrall approach to identify the RTS situation for the BCC model in "multiplier form" with virtual weight restrictions that are imposed on the model by DM judgments. Imposing weight restrictions on DEA models often creates problems of infeasibili...

Publication date: 2009